Educative: Interactive Courses for Software Developers

The solution that appeals to some programmers when they need to support variable attributes is to create a second table, storing attributes as rows. See the diagram below showing the two tables.

Each row in this attribute table has three columns:

The Entity: Typically this is a foreign key to a parent table that has one row per entity.
The Attribute: This is simply the name of a column in a conventional table, but we have to identify the attribute on each given row.
The Value: Each entity has a value for each of its attributes.

For example, a given bug is an entity we identify by its primary key value 1234. It has an attribute called status. The value of that attribute for bug 1234 is NEW.

This design is called Entity-Attribute-Value (EVA). It’s also sometimes called open schema, schemaless, or name-value pairs.

Let’s create the Issues table and the IssueAttributes table by using the following code.

Enter to Rename, Shift+Enter to Preview

Creating Issues and IssueAttributes tables

After creating the tables, let’s insert the data as given.

INSERT INTO Issues (issue_id) VALUES (1234);
INSERT INTO IssueAttributes (issue_id, attr_name, attr_value)
VALUES
(1234, 'product', '1'),
(1234, 'date_reported', '2009-06-01'),
(1234, 'status', 'NEW'),
(1234, 'description', 'Saving does not work'),
(1234, 'reported_by', 'Bill'),
(1234, 'version_affected', '1.0'),
(1234, 'severity', 'loss of functionality'),
(1234, 'priority', 'high');

Enter to Rename, Shift+Enter to Preview

Inserting data into Issues and IssueAttributes tables

Test it yourself in the following playground.

CREATE TABLE Issues (
issue_id SERIAL PRIMARY KEY
);
CREATE TABLE IssueAttributes (
issue_id BIGINT UNSIGNED NOT NULL,
attr_name VARCHAR(100) NOT NULL,
attr_value VARCHAR(100),
PRIMARY KEY (issue_id, attr_name),
FOREIGN KEY (issue_id) REFERENCES Issues(issue_id)
);
INSERT INTO Issues (issue_id) VALUES (1234);
INSERT INTO IssueAttributes (issue_id, attr_name, attr_value)
VALUES
(1234, 'product', '1'),
(1234, 'date_reported', '2009-06-01'),
(1234, 'status', 'NEW'),
(1234, 'description', 'Saving does not work'),
(1234, 'reported_by', 'Bill'),
(1234, 'version_affected', '1.0'),
(1234, 'severity', 'loss of functionality'),
(1234, 'priority', 'high');
SELECT * FROM IssueAttributes;

Enter to Rename, Shift+Enter to Preview

Retrieving data from IssueAttributes

By adding one additional table, we seem to gain the following benefits:

Both tables have a few columns.
The number of columns doesn’t need to grow to support new attributes.
We avoid a clutter of columns that contain null in rows where the attribute is inapplicable.

This appears to be an improved design. However, the simple database structure doesn’t make up for the difficulty of using it.

Querying an attribute#

Imagine that your boss needs to run a report of the bugs reported per day. In a conventional table design, the Issues table would have a simple attribute column such as date_reported. The statement for creating the conventional Issues table is given below:

Enter to Rename, Shift+Enter to Preview

Creating conventional Issues table

You can insert the data in the Issues table, as follows:

Enter to Rename, Shift+Enter to Preview

Inserting data into Issues table

To query all bugs with their report dates, your boss could use a simple query like this:

Enter to Rename, Shift+Enter to Preview

Querying all bugs with their report dates

To get the same information as the previous query using the EAV design, your boss needs to fetch rows from the IssueAttributes table that stores an attribute named by the string date_reported. This query is more verbose but less clear.

Enter to Rename, Shift+Enter to Preview

Querying date_reported in EAV design

Reconstructing a row#

It’s natural to retrieve a row from the Issues table with all its attributes in columns. We want to fetch an issue in a single row as though it were stored in a conventional table. We would like to view the records like in the following table.

issue_id	date_reported	status	priority	description
1234	2009-06-01	NEW	HIGH	Saving does not work

Because each attribute is stored on a separate row of the IssueAttributes table, retrieving them all as part of a single row requires a JOIN for each attribute. We must know all attributes at the time we write this query. The following query reconstructs the row shown earlier:

SELECT i.issue_id,
i1.attr_value AS "date_reported",
i2.attr_value AS "status",
i3.attr_value AS "priority",
i4.attr_value AS "description"
FROM Issues AS i
LEFT OUTER JOIN IssueAttributes AS i1
ON i.issue_id = i1.issue_id AND i1.attr_name = 'date_reported'
LEFT OUTER JOIN IssueAttributes AS i2
ON i.issue_id = i2.issue_id AND i2.attr_name = 'status'
LEFT OUTER JOIN IssueAttributes AS i3
ON i.issue_id = i3.issue_id AND i3.attr_name = 'priority'
LEFT OUTER JOIN IssueAttributes AS i4
ON i.issue_id = i4.issue_id AND i4.attr_name = 'description'
WHERE i.issue_id = 1234;

Enter to Rename, Shift+Enter to Preview

Retrieving data as it were stored in a conventional table

We must use OUTER JOIN operations because INNER JOIN operations would cause the query to return no rows if any of the attributes were not present in the IssueAttributes table. As the number of attributes increases, so does the number of JOIN operations, and the cost of this query increases exponentially.

However, if we need a query to support such a high number of JOIN operations that we’re concerned about exceeding the database’s limits, we may have a problem with our database design. It’s common for an EAV design to lead to this problem.

Introduction

Logical Antipattern - Jaywalking

Logical Antipattern - Naive Trees

Logical Antipattern - ID Required

Logical Antipattern - Keyless Entry

Assessment: Logical Antipatterns - Part 1

Logical Antipattern - Entity-Attribute-Value

Logical Antipattern - Polymorphic Associations

Logical Antipattern - Multicolumn Attributes

Logical Antipattern - Metadata Tribbles

Assessment: Logical Antipatterns - Part 2

Physical Antipattern - Rounding Errors

Physical Antipattern - 31 Flavors

Physical Antipattern - Phantom Files

Physical Antipattern - Index Shotgun

Assessment: Physical Antipatterns

Query Antipattern - Fear of the Unknown

Query Antipattern - Ambiguous Groups

Query Antipattern - Random Selection

Query Antipattern - Poor Man’s Search Engine

Query Antipattern - Spaghetti Query

Query Antipattern - Implicit Columns

Assessment: Query Antipattern

Application Development Antipattern - Readable Passwords

Application Development Antipattern - SQL Injection

Application Development Antipattern - Pseudokey Neat-Freak

Application Development Antipattern - See No Evil

Application Development Antipattern - Diplomatic Immunity

Application Development Antipattern - Magic Beans

Assessment: Application Development Antipattern

Conclusion

Antipattern: Use a Generic Attribute Table

Querying an attribute#

Reconstructing a row#